Analysis & Experimental Results

Author

  • George Karypis
Abstract

In recent years we have seen tremendous growth in the volume of text documents available on the Internet, in digital libraries, news sources, and company-wide intranets. Automatic text categorization, the task of assigning text documents to pre-specified classes (topics or themes), is an important task that helps both in organizing and in finding information in these huge resources. Text categorization presents unique challenges due to the large number of attributes present in the dataset, the large number of training samples, and attribute dependencies. In this paper we focus on a simple linear-time centroid-based document classification algorithm that, despite its simplicity and robust performance, has not been extensively studied and analyzed. Our extensive experiments show that this centroid-based classifier consistently and substantially outperforms other algorithms, such as naive Bayesian, k-nearest-neighbors, and C4.5, on a wide range of datasets. Our analysis shows that the similarity measure used by the centroid-based scheme allows it to classify a new document based on how closely its behavior matches the behavior of the documents belonging to different classes, as measured by the average similarity between the documents. This matching allows it to dynamically adjust for classes with different densities. Furthermore, our analysis shows that the similarity measure of the centroid-based scheme accounts for dependencies between the terms in the different classes. We believe that this feature is the reason why it consistently outperforms other classifiers that cannot take these dependencies into account.
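
The centroid-based scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: it assumes documents are given as term-frequency dictionaries scaled to unit length, that each class centroid is the mean of its documents' vectors, and that cosine similarity is the similarity measure. The function names and the toy data are hypothetical, and the abstract does not specify the term weighting.

```python
import math
from collections import defaultdict


def normalize(vec):
    """Scale a sparse term vector (dict term -> weight) to unit length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else dict(vec)


def dot(a, b):
    """Dot product of two sparse vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def cosine(a, b):
    """Cosine similarity of two sparse vectors (0.0 if either is empty)."""
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0


def train_centroids(docs, labels):
    """One pass over the training set: average the unit vectors per class."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for doc, label in zip(docs, labels):
        for term, weight in normalize(doc).items():
            sums[label][term] += weight
        counts[label] += 1
    return {
        label: {t: w / counts[label] for t, w in vec.items()}
        for label, vec in sums.items()
    }


def classify(doc, centroids):
    """Assign the class whose centroid is most similar to the document."""
    return max(centroids, key=lambda label: cosine(doc, centroids[label]))


# Toy training data (assumed for illustration only).
docs = [
    {"ball": 2, "goal": 1},
    {"ball": 1, "team": 2},
    {"stock": 2, "market": 1},
    {"market": 2, "price": 1},
]
labels = ["sports", "sports", "finance", "finance"]
centroids = train_centroids(docs, labels)

print(classify({"goal": 1, "team": 1}, centroids))    # sports
print(classify({"price": 1, "stock": 1}, centroids))  # finance
```

Under these assumptions, the dot product between a unit-length document and a class centroid equals the average cosine similarity between that document and the class's training documents, which is one way to read the abstract's "average similarity between the documents".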

Similar resources

Finite Element Analysis and Experimental Investigation on the Conventional and Vibration Assisted Drilling

In this research, finite element analysis of conventional drilling and vibration assisted drilling is carried out. The ABAQUS software is employed for the FE analysis. The Johnson-Cook models for both plastic deformation and damage are employed for the FE simulation. The results of the FE analysis are then verified with experimental results in both conventional and vibration assisted drilling on Al ...

Experimental and Numerical Investigation of Air Temperature Distribution inside a Car under Solar Load Condition

In this work, both experimental and numerical analyses are carried out to investigate the effect of solar radiation on the cabin air temperature of a Maruti Suzuki Celerio car parked for 90 min under solar load condition. The experimental and numerical analyses cover the temperature increment of air at various locations ins...

Experimental and 3D Finite Element Analysis of a Slotless Air-Cored Axial Flux PMSG for Wind Turbine Application

In this research paper, the performance of an air-cored axial flux permanent magnet synchronous generator is evaluated for low speed, direct drive applications using 3D finite element modeling and experimental tests. The structure of the considered machine consists of double rotor and coreless stator, which results in the absence of core losses, reduction of stator weight and elimination of cog...

Modal analysis of a turbo-pump shaft: An innovative suspending method to improve the results

Modal parameter extraction of high speed shafts is of critical importance in the mechanical design of turbo-pumps. Due to the complex geometry and peripheral components of turbo-pumps, difficulties can arise in the determination of modal parameters. In this study, the modal properties of a turbo-pump shaft were studied by experimental modal analysis using different excitation techniques. An innovative...

Vibration of cracked plate using differential quadrature method and experimental modal analysis

In this study, the vibration of cracked plates is investigated using the differential quadrature method and experimental modal analysis. The crack, which is assumed to be open, is modeled by the extended rotational spring. With its finite length, the crack divides the plate into six segments. Then, the differential quadrature is applied to the governing differential equations of motion and the ...

Journal title:

Volume   Issue 

Pages  -

Publication date: 2000